Hyperplane Training of a Hypersphere Classifier
Abstract
A novel classifier architecture is introduced which belongs to both the hyperplane and hypersphere families. The basic computational unit in the architecture is a perceptron whose input is augmented by its squared length. Traditional methods of training hyperplane classifiers (the perceptron training algorithm, backpropagation, etc.) operate in the augmented input space and induce hyperspherical decision regions in the original input space. The multilayer architecture based on these units includes, as special cases, the multilayer perceptron and the radial basis function network.

1 Hypersphere and Hyperplane Classifier Architectures

One conventionally distinguishes between three types of statistical pattern classifiers: example-based classifiers (e.g., k nearest neighbors), hypersphere classifiers (such as the radial basis function or RBF network) (Broomhead and Lowe 1988), and hyperplane classifiers (such as the multilayer perceptron or MLP network) (Rumelhart and McClelland 1986). Example-based classifiers may not require training, but they suffer from large memory requirements and long classification times, and do not, in general, attain the minimal possible (Bayes) classification error. They will not be considered further here. Hypersphere classifiers have modest training and classification times, excellent false alarm rejection rates, and can attain the Bayes error with large training sets. Hyperplane classifiers may require longer training times but classify faster. They also attain minimal error, but have virtually no inherent false alarm rejection capability (Stein et al. 1993). While hypersphere classifiers endeavor to capture the class probability distributions, hyperplane classifiers only try to find inter-region boundaries.

Hyperplane classifiers have proven more popular than hypersphere ones in practice, for several reasons. The simplest hyperplane classifier, the perceptron, can be trained in a finite number of steps, at least when a separating hyperplane exists. The multilayer perceptron can create arbitrary decision regions, and tends to have a somewhat lower misclassification rate than hypersphere classifiers for small training sets, due to its more efficient use of examples during training. The most popular MLP training algorithms are variants of backpropagation, which do not usually converge to a global optimum but are straightforward to implement. RBF training methods either call for a clustering stage or arbitrarily choose a small number of input examples as bases.

It would thus be beneficial to combine the best features of hypersphere and hyperplane classifiers. Such a combined classifier would have simple training procedures as well as low misclassification and false alarm rates. In the sequel we propose such an architecture, which can be implemented by making only minimal changes to existing MLP systems.

The connection between neurons which compute the norm of a difference and those which compute inner products has been studied previously (Seligson et al. 1992), with the objective of replacing the latter with the former. That work proved formal equivalence for neurons with binary input and output, and demonstrated empirically the inferiority of the difference neuron in other cases. The present work is complementary in the sense that difference-like decision regions are obtained by exploiting product neurons.

2 The Augmented Perceptron

A tactic often employed by pattern recognition practitioners is to add auxiliary variables to the input.
Such auxiliary variables are produced by combining the original input variables in ways that the classifier itself cannot. Consider the simple hard-limiting perceptron, which classifies an input pattern as positive or negative according to the sign of a linear combination of the input variables $x_i$,

$$y = \operatorname{sgn}\!\left( \sum_{i=1}^{N} w_i x_i + w_0 \right).$$

Here the auxiliary variable is the squared length of the input, $\|\mathbf{x}\|^2 = \sum_{i=1}^{N} x_i^2$, appended as an extra input coordinate, which yields the augmented perceptron

$$y = \operatorname{sgn}\!\left( \sum_{i=1}^{N} w_i x_i + w_{N+1} \|\mathbf{x}\|^2 + w_0 \right).$$
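To see why hyperplane training in the augmented space induces hyperspherical regions in the original space, one can complete the square (a short derivation consistent with the construction above, assuming $w_{N+1} \neq 0$):

$$\sum_{i=1}^{N} w_i x_i + w_{N+1} \|\mathbf{x}\|^2 + w_0 = 0 \quad\Longleftrightarrow\quad \left\| \mathbf{x} + \frac{\mathbf{w}}{2 w_{N+1}} \right\|^2 = \frac{\|\mathbf{w}\|^2}{4 w_{N+1}^2} - \frac{w_0}{w_{N+1}},$$

which is a hypersphere centered at $-\mathbf{w}/(2 w_{N+1})$. For $w_{N+1} = 0$ the boundary degenerates to an ordinary hyperplane, recovering the plain perceptron as a special case.

The sketch below illustrates the idea numerically. It is a minimal NumPy demonstration, not code from the paper; `augment`, `train_perceptron`, and `predict` are illustrative helper names. An ordinary perceptron trained on inputs augmented with $\|\mathbf{x}\|^2$ learns a circular decision region that no unaugmented perceptron can represent:

```python
import numpy as np

def augment(X):
    # Append the squared length ||x||^2 of each pattern as an extra input.
    return np.hstack([X, (X ** 2).sum(axis=1, keepdims=True)])

def train_perceptron(X, y, epochs=200, lr=1.0):
    # Classic fixed-increment perceptron rule; labels y are in {-1, +1}.
    Xb = np.hstack([X, np.ones((len(X), 1))])   # constant bias input
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:              # misclassified pattern
                w += lr * yi * xi               # perceptron update
    return w

def predict(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.sign(Xb @ w)

# Toy problem: +1 inside the unit circle, -1 outside. Not linearly
# separable in the plane, but separable once ||x||^2 is appended.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(400, 2))
y = np.where((X ** 2).sum(axis=1) < 1.0, 1.0, -1.0)

w = train_perceptron(augment(X), y)
acc = (predict(w, augment(X)) == y).mean()
print(f"training accuracy: {acc:.3f}")  # should reach 1.0 once converged
```

Note that the training loop is the textbook perceptron rule, untouched; only the input representation changes, which is precisely the point of the architecture.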